 fid score




Supplement to Amortized Projection Optimization for Sliced Wasserstein Generative Models

Neural Information Processing Systems

PRW can be seen as a generalization of Max-SW, since PRW with k = 1 is equivalent to Max-SW. As with Max-SW, PRW is optimized by projected gradient ascent. The details of the algorithm are given in Algorithm 4. Other optimization methods have also been used to solve PRW, such as Riemannian optimization [28] and block coordinate descent [21]. In this paper, however, we consider the original and simplest method, projected gradient ascent.
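Algorithm 4 itself is not reproduced in this excerpt, so the following is only a minimal sketch of projected gradient ascent for PRW between two equal-size empirical measures, under several assumptions: uniform weights (so the optimal plan is a permutation, found here with SciPy's `linear_sum_assignment`), squared Euclidean cost, a fixed step size, and projection onto matrices with orthonormal columns via the SVD polar factor. The names `project_stiefel` and `prw_projected_gradient_ascent` and all default parameters are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def project_stiefel(U):
    # Nearest matrix with orthonormal columns (polar factor from the thin SVD).
    P, _, Qt = np.linalg.svd(U, full_matrices=False)
    return P @ Qt


def _projected_matching(X, Y, U):
    # Optimal assignment between the projected point clouds X @ U and Y @ U.
    Xp, Yp = X @ U, Y @ U
    cost = ((Xp[:, None, :] - Yp[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return rows, cols, cost


def prw_projected_gradient_ascent(X, Y, k=1, steps=50, lr=0.1, seed=0):
    """Sketch of PRW via projected gradient ascent (assumed setup, see lead-in).

    X, Y: arrays of shape (n, d) with the same n. Returns (U, value), where
    U has shape (d, k) and value is the projected 2-Wasserstein distance.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = project_stiefel(rng.standard_normal((d, k)))
    for _ in range(steps):
        rows, cols, _ = _projected_matching(X, Y, U)
        D = X[rows] - Y[cols]                 # matched displacements, shape (n, d)
        grad = 2.0 * D.T @ (D @ U) / n        # gradient of mean ||U^T D_i||^2 in U
        U = project_stiefel(U + lr * grad)    # ascent step, then projection
    rows, cols, cost = _projected_matching(X, Y, U)
    value = np.sqrt(cost[rows, cols].mean())
    return U, value
```

With k = 1 the matrix U reduces to a single unit vector, which corresponds to the Max-SW case mentioned above. The gradient treats the matching as fixed at each step, which is a common simplification rather than something stated in the source.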


Supplementary: Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning. A. Analyzing the model bias for selecting train-test splits

Neural Information Processing Systems

Values are normalized so that FID progressions are comparable: FID scores have no upper bound, so absolute values differ across networks and pretraining methods. To analyze how the network architecture, pretraining method, and training data (that is, the learned feature representations) affect the construction of train-test splits and their difficulty, we repeat the class swapping and removal procedure introduced in Section 3 of the main paper using different self-supervised models. Subsequently, we select train-test splits from the same iteration steps. Figure 1 compares the progression of distribution shifts based on FID scores normalized to the [0,1] interval for valid comparison. We observe that across all pretrained models, the general FID progressions and the sampled train-test splits exhibit very similar learning problem difficulties, indicating that our sampling procedure is robust to the choice of readily available, state-of-the-art self-supervised pretrained models.
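The excerpt does not say which normalization maps the FID progression to [0,1]. As an illustration only, the sketch below assumes min-max scaling applied separately to each model's progression; the function name `normalize_progression` and the example values are hypothetical and are not results from the paper.

```python
import numpy as np


def normalize_progression(fid_scores):
    """Min-max scale one model's FID progression to [0, 1] (assumed scheme)."""
    scores = np.asarray(fid_scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # A constant progression carries no shape information; map it to zeros.
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)


# Hypothetical progressions from two backbones with very different FID scales.
model_a = normalize_progression([12.0, 30.0, 48.0])
model_b = normalize_progression([150.0, 375.0, 600.0])
```

After scaling, both hypothetical progressions share the same range, so their shapes can be compared directly even though the raw FID magnitudes differ.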